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ABSTRACT 

We extend the recently introduced Bayesian framework ‘Generative Pulsar Timing Analysis’ 
to incorporate both pulse jitter (high frequency variation in the arrival time of the pulse) and 
epoch to epoch stochasticity in the shape of the pulse profile. This framework allows for a full 
timing analysis to be performed on the folded profile data, rather than the site arrival times as 
is typical in most timing studies. We apply this extended framework both to simulations, and 
to an 11 yr, 10 cm data set for PSR J1909-3744. Using simulations, we show that temporal 
profile variation can induce timing noise in the residuals that when performing a standard 
timing analysis is highly covariant with the signal expected from a gravitational wave (GW) 
background. When working in the profile domain, these variations are de-correlated from the 
expected GW signal, resulting in significant improvement in the obtained upper limits. Using 
the PSR J1909-3744 data set from the Parkes Pulsar Timing Array project, we find significant 
evidence for systematic high-frequency profile variation resulting from non-Gaussian noise in 
the oldest observing system, but no evidence for either detectable pulse jitter, or low-frequency 
profile shape variation. Using our profile domain framework we therefore obtain upper limits 
on a red noise process with a spectral index of y = 13/3 of 1 x 10“'^, consistent with previously 
published limits. 
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1 INTRODUCTION 

The emission of electromagnetic radiation from pulsars in the radio 
band is thought to be the result of highly energetic charged particles 
being accelerated, and then escaping, along open magnetic field 
lines in the pulsar’s magnetosphere. When the magnetic axis is not 
perfectly aligned with the rotational axis, the result is a beam of 
radiation that sweeps across the sky. If an observer lies in the path 
of this beam, they will observe a series of regularly timed pulses, 
giving the pulsar its trademark lighthouse effect. 

The time of arrival (TOA) of these pulses is predicted by the 
pulsar’s ‘timing model’. This is a deterministic description of the 
pulsar, describing its rotational period and spin down, astrometric 
properties, and if the pulsar is part of a binary system, additional 
parameters such as its orbital period, and geometry of the orbit with 
respect to the Earth. 

When a pulsar is observed for the purpose of timing, the in¬ 
dividual pulses from a single observing epoch are folded together 
using some fiducial timing model to form an average pulse pro¬ 
file for that observation. This averaged data can then be compared 
to a model for the pulse profile, a process that results in the cre¬ 
ation of a set of observed TOAs, referred to as site arrival times 
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(SATs). These SATs can then be compared to those predicted by 
the pulsar’s timing model, where the difference between the two 
are known as the ‘timing residuals’. These residuals carry physical 
information about the unmodelled effects in the pulse propagation, 
including intrinsic variation in the pulse emission time (e.g. |Shan^ 
[non et aL]|2014[ [Hobbs et al.||2010t , or extrinsic factors, such as 
perturbations due to gravitational waves (GWs) (e.g. |Sazhin|1978[ 
jPetweilerj 1979k 


The most stringent 95% upper limit on the amplitude of an 
isotropic stochastic GW background (GWB) formed from the in¬ 
coherent superposition of GWs emitted from a large number of 
merging supermassive black-hole binaries is A < 1 x 10“*^ for a 
reference frequency of one yr“' l Shannon et al.|2()T5 henceforth 
SI5). This is equivalent to a red noise process with an rms ampli¬ 
tude of ~ 70 ns in a 10 yr data set. In Fig.[T] we show the effect 
of a 100ns deviation due to a GWB on the arrival time of the aver¬ 
age profile at a single observational epoch. In the PSR J1909-3744 
10cm data set described in Section the largest number of phase 
bins used per epoch is 2048, corresponding to a deviation of only 
~ one tenth of a phase bin. 

Several approaches to building a profile template have previ¬ 
ously been used. For example, a template can be formed by aver¬ 
aging over the individual profiles at each observational epoch to 
create a single high S/N profile that can then be used to form the 
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Figure 1. Noiseless model for the deterministic profile of PSR J1909-3744 
at zero phase (black line), and after being shifted by 100ns, comparable to 
the effect of a passing GW from a 10“*^ isotropic GWB (red line). The 
timespan along the x-axis covers 7.5% of a full rotation for this pulsar. In 
the PSR J1909-3744 10cm data set described in Section|^this corresponds 
to, at best, ~ one tenth of a phase bin. 


TOAs directly. This, however, has the disadvantage that any noise 
present in the template can bias the resulting TOA estimates (|Hotan| 
|et al.|2003| l, especially if the template noise is correlated with noise 
in the profiles. Alternatively, rather than use the averaged data, a 
smoothed version of the template can be used (e.g.|Demorest et al.| 
[20l3l l, or analytic functions can be fit to the averaged profile to 
form a noise free template (e.g. [Manchester et al.|2013) l. 

Once a template has been developed it is then used to form the 
TOAs for each observational epoch. This is most commonly done 
via the ‘Fourier phase-gradient method’ ( |Taylor|1992| l in which the 
phase offset between the two is computed using the Fourier trans¬ 
form of both the template, and the profile at each epoch, and a cross 
correlation between the two performed. Alternative time domain 
approaches have also been used (e.g. [Hotan et al.|2005| l, however 
regardless of the approach, they all share a common assumption; 
that the profile is stable within radiometer noise from epoch to 
epoch. 

While long term stability of pulse profiles has been shown 
in some pulsars (e.g. jShao et aL||2013| ), epoch to epoch variation 
in the profile shape has also been observed. For example, a study 
of morphological variability in PSR J1022+1001 suggests that the 
pulse profile varies at the few per cent level ( [Hotan et al.| |2004[ 
[Liu et al.[2015t , while PSR J0437-4715 has also been observed to 
show timing instability ( [Hotan et al.[2006| (. In both cases the origin 
of the instability could be instrumental, for example, due to polar¬ 
ization calibration errors, or it could be the results of the intrinsic 
stochasticity of the profile. The individual pulses from a pulsar are 
known to show a high degree of variability (e.g. [Hankins & Cordes[ 
|1981[ l, and so as instrumentation improves and radiometer noise de¬ 
creases, this intrinsic stochasticity will unavoidably become more 
significant within a single observation. Profile variability has also 
been observed in young pulsars, where in some cases timing noise 
has been found to be correlated with changes in the pulse shape 
( [Lyne et al.|2010[ l. Pulse profile variability associated with instru¬ 
mental distortions has also been widely observed, particularly in 


jitter-dominated observations of young pulsars or with instruments 
with low-bit digitisation ( [Jenet & Anderson||1998[ l. Typically the 
effects of these distortions have been modelled in the TOA do¬ 
main. This is done both by including additional white noise param¬ 
eters referred to as ‘EFAC’ and ‘EQUAD’, which scale and add in 
quadrature to the formal TOA uncertainties, and by incorporating a 
model for low frequency timing noise into the analysis (e.g. [Col^ 
[et al.[20rT]|van Haasteren & Levin[2013[[Lentati et al.[2014^ . This 

has the disadvantage that in a single pulsar these distortions could 
be covariant with the GW signal (see Eigure|^. 

In the top panel of Fig. we compare simulated residuals in¬ 
duced by the GW signal from an isotropic stochastic background 
with an amplitude of 1 x 10“*^ (black line), with those that result 
from the passage of an additional Gaussian component through 
the profile, with an amplitude of 0.5% that of the observed pro¬ 
file, which is not appropriately modelled by the single average 
profile used to form the TOAs (see bottom panels). All TOAs 
were simulated using the highest signal to noise profile in the PSR 
J1909-3744 data set used in Section|^ resulting in uncertainties of 
20ns for each observation. The two signals are of comparable am¬ 
plitude, implying that any unmodelled profile variation larger than 
this will quickly dominate over a GW signal in the TOAs. 

In [Lentati et al.[ ( [2015) (henceforth LI5) a Bayesian frame¬ 
work was introduced dubbed ‘Generative Pulsar Timing Analysis’ 
(GPTA) that allows for a full timing analysis using the folded pro¬ 
file data, rather than the SATs that result from the cross correlation 
with a profile template. This allowed for analysis of the pulsar’s 
timing model, along with intrinsic stochastic processes such as spin 
noise - low frequency variation in the pulse TOAs - simultaneously 
with a model for the pulse profile, for which a shapelet basis was 
used. 

In this work we extend this framework to incorporate epoch 
to epoch changes in the profile. We include a model for pulse jit¬ 
ter - high frequency stochastic variation in the arrival time of the 
profile model - along with models for variations in the shape of 
the profile, which we obtain by calculating the power spectrum of 
the variance in our shapelet model as a function of scale in phase 
space. While this doesn’t constitute a physical model for the epoch 
to epoch stochasticity, by obtaining the power spectrum of the vari¬ 
ations, we can begin to characterise the shape changes in a statisti¬ 
cally robust manner, ultimately leading to a better understanding of 
their origins. 

In Sections 1^ to 1^ we describe the models used in our pro¬ 
file domain analysis, and how we implement them in our Bayesian 
framework. In Section [6] we describe the lOcm PSR 11909-3744 
data set that we use to construct our simulations described in Sec- 
tion|^ and that we use in our analysis in Section]^ Finally we offer 
some concluding remarks in Section|^ 


2 A PROFILE DOMAIN MODEL 

The methods used in this analysis are drawn from those presented 
in LI5. Here, our pulsar timing analysis is performed entirely with 
the profile data, rather than the TOAs formed from those profiles. 
Qualitatively, in each likelihood calculation, we construct a model 
for the deterministic (or average) profile using a shapelet basis, and 
generate a model time of arrival at each observational epoch for 
that profile using the pulsars timing model. Both these steps oc¬ 
cur simultaneously, such that both the parameters that describe the 
shapelet model, and the timing model parameters are free to vary 
within our analysis. 
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Figure 2. (Top) Black line - Simulated residuals due to a GW signal from an isotropic stochastic background with an amplitude of 1 X 10“*^, consistent with 
the most stringent 95% upper limits set by Shannon et al. (2015). TOAs were simulated using the highest signal to noise profile in the PSR J1909-3744 data 
set used in Section]^ resulting in uncertainties of 20ns for each observation. Red line - Simulated residuals induced by the passage of an additional Gaussian 
component to the profile data (see bottom panels), not included in the template at the time of forming the TOA, with an amplitude of 0.5% that of the observed 
profile. The two signals are of comparable amplitude, implying that any unmodelled profile variation larger than this will quickly dominate over a GW signal 
in the TOAs. (Bottom) 3 examples of the additional Gaussian component at different positions in the main profile, and residuals from the profile fit. 


While a full description of the general framework we will use 
is available in L15, in this work we will be extending the method¬ 
ology significantly to incorporate the possibility of epoch to epoch 
variation in the profile. We include models for pulse jitter - a shift 
in the arrival time of the deterministic profile - as well as shape 
variation that could be of instrumental, or astrophysical origin. As 
such we will include below an overview of the basic framework, 
before providing details on the modifications required to incorpo¬ 
rate profile stochasticity. 

2.1 Shapelets 

A thorough description of the Shapelet formalism can be found in 
Refregier| j2003|, with astronomical uses being described in e.g, 
Kelly & McKay| ( [2004^ ; [Lentati et al.]j2013| l; rRefregier & Bacon| 
1 2003 Here we give only an outline to aid later discussion. 

Shapelets are described by a set of dimensionless basis func¬ 
tions, which in one dimension can be written as: 



where n is a non-negative integer, and H„ is the Hermite polyno¬ 
mial of order n. Therefore the lowest order shapelet is given by a 
standard Gaussian (Hq(x) =1), with higher order terms represented 
by a Gaussian multiplied by the relevant polynomial. 

These are then modified by a scale factor A which is a free 


parameter to be fitted for, in order to construct dimensional basis 
functions: 

B„(x;A) = A-‘/2 ^„(A-'x). (2) 

These basis functions are orthonormal, i.e: 

dx S„(x; A)fi,„(x; A) = d„„„ (3) 

OO 

where S„,„ is the Kronecker delta, so that we can represent a func¬ 
tion x(x) as the sum: 

max 

s{x,(,A)^^(Mx;A), (4) 

n=0 

where are the shapelet amplitudes, and the number of 
shapelet basis vectors included in the model. 

In our analysis of pulsar timing data, we form a single profile 
shape, which will then be scaled from epoch to epoch. We therefore 
redefine Eq.|^ such that we have a single global amplitude A, and 
"max - 1 parameters which are the amplitudes for the shapelet 
components that have n > 0. These therefore represent the relative 
contribution to the overall profile shape, relative to the zeroth-order 
term, which we take to have an amplitude of 1. Written in this way 
Eq. [^becomes: 

Umax 

j(x,A,^,A) = A^^„S„(x;A). (5) 

0=0 
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Finally, the total integrated flux in the model profile, Ftot, is 
given by 


F 


tot 



dx i(x) = A ^ 

n=even 



( 6 ) 


While there is no natural basis in which to describe the shape 
of the pulse profile, shapelets present several advantages when 
compared to, for example, a set of von Mises functions. A shapelet 
model requires a single width parameter, and a set of am¬ 
plitudes which multiply the linear basis vectors given that width. 
Phase wrapped Gaussians, or von Mises functions, however, have 
n widths and n - 1 relative positions, which can become highly 
multi-modal (each component used can switch places while giving 
the same result), and thus computationally expensive when explor¬ 
ing the parameter space. In addition, as we will show in Section]^ 
it is trivial to use shapelets to incorporate additional stochastic el¬ 
ements in the profile, without having to sample numerically over a 
large number of additional dimensions, as we can deal with many 
of the amplitude parameters analytically. 

Finally we note that while in this work we discuss only 
the use of one-dimensional shapelets, in |Refregier| ( |2003| l a two- 
dimensional shapelet basis is also described. In the context of pro¬ 
file domain pulsar timing analysis, the use of two-dimensional 
shapelets can readily be seen as an extension to the methodology 
we will describe below. While we will consider only radio data that 
has been fully integrated over frequency, modem observing sys¬ 
tems observe wide bandwidths. Two dimensional template fitting 
that incorporates a model for both the frequency dependence of 
the profile across the band (e.g |Pennucci et al.| ( |2014| l; |Liu et ak] 
( |2014^ ), and the broadband component of pulse jitter, could fur¬ 
ther improve the timing precision obtainable via a profile domain 
analysis. We will investigate this possibility in future work. 


2.2 Evaluating the timing model 

We begin by considering our data, d, a set of Nj integrated pulse 
profiles, where the profile at observational epoch i consists of a set 
of Ni values representing the flux density of the profile as measured 
at a set of times t,. We represent d as a sum of both a deterministic 
and a stochastic component: 

d = s -I- n, (7) 

where d represents the Y!i=i points for the Nj pulse pro¬ 

files for a single pulsar, with s and n the deterministic and stochas¬ 
tic contributions to the total respectively, where any contributions 
to the latter will be modelled as random Gaussian processes. 

We first consider the stochastic component of the signal, n, to 
be described solely by an uncorrelated random Gaussian process 
with RMS (T,, determined individually at each epoch i. The deter¬ 
ministic component, s, consists only of (i) our shapelet model for 
the pulse profiles with the centroid position for each model pro¬ 
file determined using the set of arrival times r(e) predicted by the 
pulsar’s m timing model parameters, e, and (ii) an arbitrary base¬ 
line offset for each profile epoch. With the inclusion of the timing 
model, we can rewrite Eq.|^to describe the shapelet model for a 
particular epoch i as: 

^max 

i,(r,A,^,A,6) = A^^„S„(f-T,(e);A), (8) 

n=0 

with Ti the barycentric arrival time for pulse i predicted by the set 
of timing model parameters e. 


In LI5 it was assumed that the correction to the SSB for each 
profile could be computed at the SAT obtained previously using 
traditional techniques for that profile. In our analysis of the PSR 
J1909-3744, the inclusion of phase offsets between dilferent ob¬ 
serving systems (referred to as ‘Jumps’) means that we must first 
correct the SATs using the values for these offsets, and the overall 
phase offset. We then compute the correction to the Solar System 
Barycenter at these new values. This is simply because the correc¬ 
tion due to the Roemer delay can vary by ~ 100 ns across the set of 
times tj that span the pulse period of the pulsar, which is significant 
in the context of the highest precision observations available today. 

We can then write the likelihood that the data is described only 
by the shapelet parameters, 6 = (A, A), the timing model param¬ 

eters e, and the baseline offset, y,, for each epoch i as: 


Nd 


Pr(d|e, e) oc 


VdetNi 


(9) 


X exp 


__(d - Si(0, e) - y,)^Nr‘(d - Si(0, e) - y,) 


where N, is the white noise covariance matrix for epoch i, with 
elements = CTiSjk. 


3 INCLUDING ADDITIONAL STOCHASTIC 
PARAMETERS 

3.1 Radiometer noise 

In Eq. the amplitude of the radiometer noise in the profile data is 
taken to be a known quantity at each epoch. In principle we would 
want to include cr, as a free parameter for each epoch, however 
this is not currently computationally tractable for large data sets. In 
our analysis we can obtain a good initial estimate of the radiometer 
noise from the off-model region of the profile data, where the am¬ 
plitude of the model profile is less than 0.1%. As in TOA domain 
analysis we can then include additional parameters that help ac¬ 
count for our uncertainty in this estimate, known as EFAC, which 
we denote a, which scales our initial estimate of the radiometer 
noise for all epochs associated with a given observing system. 

Our value for the radiometer noise then becomes: 

= (aiO-ijf- (10) 


3.2 Profile Jitter 

We define profile jitter in our model as a shift in the arrival time 
relative to those predicted by the pulsar’s timing model of the deter¬ 
ministic profile that is uncorrelated between different observational 
epochs. This could be the result of systematic effects, or intrinsic 
high frequency stochasticity in the emission time of the pulse (see 
Fig. a e.g. Shannon et al. (2014)). In the latter case, even if we as¬ 
sume that the mean profile is the same at each epoch, the observed 
arrival time could still be subject to a shift due to this stochasticity 
in emission (see Fig. |^. Regardless of its origin, in the TOA do¬ 
main this is typically modelled with an EQUAD parameter, which 
adds in quadrature to the formal TOA uncertainty. In the profile do¬ 
main we can directly incorporate a model that describes a stochastic 
shift in the mean profile. In our model we assume that the jitter am¬ 
plitudes at each epoch can be described by a Gaussian distribution, 
where the variance of that distribution is a free parameter in our 
analysis. 
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Figure 3. (Bottom) A series of simulated single pulses with a Gaussian 
profile, and a high frequency stochastic element to the pulse emission time. 
(Top) The mean profile that results from averaging the single pulses. Even 
if we assume that the mean profile is the same at each epoch, the observed 
arrival time could still be subject to a shift due to this stochasticity in emis¬ 
sion. In the TOA domain this is typically modelled with an EQUAD param¬ 
eter, which adds in quadrature to the formal TOA uncertainty. In the profile 
domain we can directly incorporate a model that describes a stochastic shift 
in the mean profile. 


If we write the amplitude of the shift due to profile jitter at 
any epoch i as Sh, we can modify Eq. [^trivially to include this new 
parameter as: 

^>niax 

Si{i,A,^,A,e,6ti) = - Ti(e) - Str,A). (11) 

n=0 

If this was the only modification to our analysis method, clearly 
introducing a shift parameter per epoch would result in complete 
covariance with the timing model parameters, it would be equiv¬ 
alent to introducing a jump in between each TOA. We will later 
include a prior on these shift amplitudes that constrains them to 
come from a Gaussian distribution, and will show in the context of 
simulations, that when the jitter model is appropriate, this method 


produces results that are completely consistent with the simulated 
jitter amplitudes, and constraints that are superior to those obtain¬ 
able in the TOA domain, with no loss in timing precision. 

Written as in Eq. we have to include Nj free parameters 
describing the jitter in our analysis, which, as for the overall pro¬ 
file amplitudes would rapidly become intractable for large data 
sets. An advantage of the shapelet basis, is that for any arbitrary 
shapelet model we can analytically expand Eq.[TT]for small ampli¬ 
tudes Stj relative to the width parameter A. This allows us to lin¬ 
earise the shift, and so marginalise analytically over the individual 
amplitudes, a process we will describe in Section A complete 
derivation of this linear translation operator, along with operators 
for other transformations, can be found in |Refregier| j2003^ . The 
result is that, for any profile model Si(t, A, e), we can write down 
a ‘jitter profile’ as: 

c. ^max 

ji(t,Sti,^,A,e) = -^^^„(Vn6„(?-T,(e);A) (12) 

V2A ^^0 

- 47^B„^,(i-T,{ey,A)). (13) 

For example, in the case where n„,„^ = 0, such that we only 
include the Gaussian term in our model, our jitter profile will be 
given by: 


ji(t, 6ti, A, e) 


-^(oi-BAt-T.(ey,A)) 

V2A 


2A 


^o(f - T,(e)) exp 


1 (t-Tiie)y\ 

2 A2 j 


(14) 

(15) 


which is equivalent to expanding exp ^ (' nW ft,) j small dtj. 
We illustrate this example in Eig.|^ 

Our shifted shapelet model will then be given by: 


(f. A, Stj, f, A, e) = Si{t, A, A, e) -l- _/,(?, dr,-, A, e). 


(16) 


As mentioned previously, we must include a constraint on the 
amplitudes of the shift parameters (5t. We do this by defining a prior 
on the amplitude parameters that is equivalent to fitting for the vari¬ 
ance of the distribution. This variance can be defined across an ar¬ 
bitrary set of observational epochs, for example, to model the vari¬ 
ance separately as a function of observing system, or for all epochs 
simultaneously. 

The covariance matrix of the jitter amplitudes, which we de¬ 
note J, is written: 

J.j = (Stidtj) = (17) 

where the set of coefficients represent the theoretical variance of 
the jitter model at epoch i. 

We normalise the amplitudes 6tj at each epoch such that they 
can be described by a single variance JT; for all observations in 
units of seconds. We cannot simply use the maximum likelihood 
amplitude of the profile model however, as if we believe our data to 
be affected by jitter, this amplitude will not be correct, which will 
decrease our sensitivity to the amplitude of jitter in the data set. 

We therefore define the scaling factor C, such that: 

Fa 

Q = (18) 

Ftotj 

where Fim,, is the total flux in the unsealed model profile at epoch 
i, and so is independent of the overall model amplitude, and 
is an estimate of the total signal flux in the data at epoch i, which 
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Figure 4. Example of the linear translation operator for the lowest order 
shapelet coefficient. The black solid curve represents the model profile at 
the anival time predicted by the pulsar’s timing model. The red line is the 
jitter profile defined in Eq. (Tg. Adding the jitter profile to the deterministic 
profile results in a shift by an amount St (black dotted line). We stress that 
in addition to the shift amplitudes representing jitter, we enforce a strong 
additional constraint in the form of our prior in Eq. {24) . All jitter ampli¬ 
tudes are constrained to come from a Gaussian distribution, whose variance 
is free parameter in our analysis. We will show using simulations in Section 
[7] that when the stochastic component of the data really is well described by 
a shift in the aiTival time, this profile domain model gives completely con¬ 
sistent constraints to those obtained in the TOA domain using an EQUAD 
parameter. 


we calculate by integrating over the bins where the model profile is 
greater than 0.1% of its total amplitude. 

We can then write our joint probability density for the individ¬ 
ual jitter amplitudes, and the variance parameters as: 


Pr(d|6/, e, 6UJ) = Pr(d|6>, 6, Si) x PT(Si\J) (19) 


where Pr(d|0,6, i5t) has the same functional form as in Eq. but 
with the the shapelet profile model being replaced with our shifted 
model in Eq. i |16| i, and Fr(St\J') is given by: 


PrC^tlJ") oc ^ exp 

VdeTJ 


2 


( 20 ) 


3.3 Profile Stochasticity 

In addition to our jitter model, we also include epoch to epoch vari¬ 
ation in the shape of the deterministic profile. This could be the re¬ 
sult of systematic effects, such as errors in polarization calibration, 
or due to intrinsic shape changes in the profile, where the radiome¬ 
ter noise for a particular observing system is below the level of the 
intrinsic stochasticity. 

We define this stochasticity in two regimes, high and low fre¬ 
quency fluctuations in the profile shape, where by frequency, we 
refer to the scale of the fluctuations in phase space. 


3.3.1 High frequency stochasticity 

Any stochasticity in the profile that occurs on scales of less than the 
bin width in phase space will present itself in any one epoch simply 


as an increase in the uncorrelated noise in the on pulse region of 
the profile. In the most general case, we can define a ‘stochastic 
envelope’, which represents the increase in the radiometer noise 
due to this process such that: 


£,(r, Se, e) 


-CiSpit, Be,e), 


( 21 ) 


where 6^ are the parameters that describe the shape of the envelope, 
Tj is the integration time for the observation at epoch i, and Cj is the 
normalisation constant defined in Eq.[T^ We then add this stochas¬ 
tic profile to our instrument noise <t, such that the new variance in 
a bin j for observational epoch i is given by: 


ajj = {UiCTijf + Si(t, e„ e)]. 


( 22 ) 


While this is the most general case, such that the shape of the en¬ 
velope is free to vary in the analysis, we will assume a simpler 
model so that the stochastic envelope has the same shape as the 
mean profile, and is proportional to the amplitude of the profile 
in any one observation. We note that giant pulses observed in both 
canonical pulsars, and MSPs (e.g.|Wolszczan et al.|jl984|;|Sallmen| 
|et al.| ( [T999| l; |Romani & Johnston| l [2001[ l) show structure on ~ 10ns 
scales, and have been observed to be localised in phase space. In 
these cases we would expect that our simple model for high fre¬ 
quency profile stochasticity would break down. In principle, how¬ 
ever, one could model such fluctuations using a non-Gaussian prob¬ 
ability density function (e.g. |Lentati et al.H2014^ ), or by folding in 
prior information about the structure of such pulses, though this is 
not an approach we consider here. 

We therefore define a scaling parameter [3 which represents the 
increase in the variance, and redefine £ such that: 


S,(t,9,e)=j3J^C,Sp{t,e,e). 

V ^ i 


(23) 


3.3.2 Low frequency stochasticity 

To model low-frequency stochastic shape changes in the profile we 
use the same shapelet basis as for the mean profile itself. Our goal is 
then to robustly determine the power spectrum of the shape changes 
as a function of the scale (or order) of the component in the model. 

We therefore define the set of shape variation power spectrum 
coefficients S, such that for a given scale i the covariance matrix, 
which we denote S, will be given by: 

S= ((^, - iMj - Ij)) = (24) 

Cd.i 

where is the mean value of the shapelet coefficient fj, which in 
our analysis we take to be the values that describe the determinis¬ 
tic profile. The normalising factor l/P^.,, with Pd.i the power in the 
on pulse region at epoch i, is included to give us the power spec¬ 
trum coefficients Sj in units of the fraction of the total power in the 
profile. 

In any one observational epoch i, we can therefore write the 
sum of the deterministic, and stochastic components of the profile 
as: 

s'(t, 9, £, f) = sft, 9, e) + Sj(t, 9, e). (25) 

We can then write our joint probability density for the individ¬ 
ual jitter amplitudes, and the variance parameters as: 

Pr(d|e,^,e,.S) = Pr(d|0,e,^)xPr(^|.S) (26) 
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where Pr(d|6, e, has the same functional form as in Eq. but 
with the the shapelet profile model being replaced by the sum of 
the mean, and stochastic profile models as in Eq. ( |25^ , and Pr(f |<S) 
is given by: 


Pr(f I.S) « 


VdetS 


exp 


2 


(27) 


4 MARGINALISING ANALYTICALLY OVER THE 
LINEAR PARAMETERS 

Ignoring any additional stochastic parameters, our model for the 
pulse profile so far contains a set of — 1 shapelet amplitudes, 
which are defined across all observing systems that operate across 
some bandwidth, and some period of time and describe the shape 
of the mean profile. At any observational epoch however, effects 
such as scintillation in the interstellar medium can cause both the 
effective band-integrated shape, and the amplitude of the profile to 
vary significantly ( |Rickett|[l969^ . As such, the overall amplitude 
must be free to vary from one epoch to another, and in addition, the 
baseline offset must also be treated as a separate parameter for each 
epoch. 

When including stochastic parameters we then have an addi¬ 
tional amplitude parameter per epoch that describes the shift in the 
arrival time due to pulse jitter, as defined in Section |3.2[ as well 
as a set of amplitude parameters that describe the low frequency 
stochastic variation in the pulse shape for that epoch as described 
in Section [3.3.2l 

In principle this introduces a large number of additional free 
parameters, however we can marginalise analytically over all of 
these amplitude parameters without introducing large matrix op¬ 
erations and thus significantly decrease the size of parameter space 
we must sample over numerically. 

If the number of amplitude parameters we wish to marginalise 
over analytically for any epoch i is An, we define a x A,- ma¬ 
trix, which we denote P, such that, for example the profile overall 
amplitude parameter and offset amplitude would be included as: 


= 1 

(28) 

with the remaining basis vectors describing pulse jitter and profile 
variation being included in rows - We then define the diago¬ 

nal, An,x An, matrix Ti, which combines our priors for the jitter and 
low frequency stochastic parameters, and is zero for the elements 
corresponding to the overall amplitude and baseline offset. Finally 
we define the length-A^, vector bj, which contains the amplitude 
parameters that multiply our basis vectors in Pi, which allows us to 
write our likelihood as: 


•■'a 

Pr(d|e,e,a,^)Pr(^|.S)Pr(dtm cv 


VdetNidePPi 


(29) 


exp 


-(di-Pibi)^Nr‘(di-Pibi) 


exp 


--bi^Ti ‘bi 


In order to perform the marginalisation over the amplitudes 
bi, we define PfNr'Pi + T'i^' as Sj and PrNr'di as dj and then 


integrate over the elements in bj analytically to get the likelihood 
that the data is described by the remaining parameters alone. 

Our marginalised probability distribution for the remaining 
parameters alone is then given by: 


Pr(0,e,.S,JT|d) cc 


X 


n 


f=i 


detff^ 

Vdet(Ni) 




2 V ■ 1 



(30) 


5 BAYESIAN ANALYSIS METHODS 


Bayesian Inference provides a consistent approach to the estimation 
of a set of parameters 0 in a model or hypothesis “H given the data, 
D. Bayes’ theorem states that: 


Pr(0 \D,<H) = 


Pr(£) I 0,‘H)Pr(0 | <H) 
Pr(Z) I K) 


(31) 


where Pr(0 | D, 'H) = Pr(0) is the posterior probability distribution 
of the parameters, Pr(D | 0, Tf) = L(0) is the likelihood, Pr(0 | 
"H) = ;r(0) is the prior probability distribution, and Pr(D | "H) = Z 
is the Bayesian Evidence. 

For model selection, the evidence is key, and is simply the 
factor required to normalise the posterior over 0: 


Z = j L(0);r(0)d"0, (32) 

where n is the dimensionality of the parameter space. 

As the average of the likelihood over the prior, the evidence 
is larger for a model if more of its parameter space is likely and 
smaller for a model where large areas of its parameter space have 
low likelihood values, even if the likelihood function is very highly 
peaked. Thus, the evidence automatically implements Occam’s ra¬ 
zor: a simpler theory with a compact parameter space will have a 
larger evidence than a more complicated one, unless the latter is 
significantly better at explaining the data. 

The question of model selection between two models “Tfo and 
'Hi can then be decided by comparing their respective posterior 
probabilities, given the observed data set D, via the posterior odds 
ratio R: 


P(Pi I D) ^ PjP I PPPiPi) ^ Zi PiHi) 
P(Ha I D) P{D I Ha)P(Ho) Zo P(Ho) ’ 


where P{Hi)IP(Ho) is the a priori probability ratio for the two 
models, which can often be set to unity but occasionally requires 
further consideration. 

The posterior odds ratio then allows us to obtain the probabil¬ 
ity of one model compared with the other, simply as: 


P = 


R 

l+R' 


(34) 


Typically in our analysis we deal with the log odds ratio, 
which is then simply the difference in the log Evidence for two 
competing models. T able lists a com mon interpretation of the 
Bayes factor given by |Kass & Raftery^ ( |1995[ l. In all our analysis 
we will consider a difference in the log Evidence between models 
of 3 to be the minimum required to justify the addition of extra 
model components. 
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Figure 5. (Left) Post-fit residuals for PSR J1909-3744 after subtracting the maximum likelihood timing model without including any additional stochastic 
parameters in the TOA domain analysis. The weighted RMS is 112ns with a reduced of 1.7. Colours represent different observing systems: (black) 
WBCORR, (red) PDFBl, (green) PDFB2, (dark blue) PDFB4, (light blue) PDFB4a. (Right) Example profile data for PSR J1909-3744 from four different 
observational epochs, each with a different observing system. The different overall fluxes, offsets and noise levels in each are apparent. 


Table 1. Interpretation of the evidence 


R 

log(R) 

P 

Strength of evidence 

1 3 

0- 

1 

0.5 - 

1. 0.75 

Not worth more than a bare mention 

3 ^ 20 

1 - 

^ 3 

0.75 - 

0.95 

Positive 

20 ^ 150 

3^5 

0.95 - 

0.99 

Strong 

> 150 

> 5 

>0.99 

Very strong 


5.1 PolyChord 

The nested sampling approach jSkilling||2004| l is a Monte-Carlo 
method targeted at the efficient calculation of the evidence, but 
also produces posterior inferences as a by-product. Recently a new 
nested sampling algorithm, Polychord ( [Handley et al.|20T5) , was 
introduced that makes evidence calculation in hundreds of dimen¬ 
sions a tractable process. The Polychord algorithm makes use of 
‘slice sampling’ jNeal|2000t . In one dimension, given a likelihood 
La, a point x is within the ‘slice’ if L(x) > La. Starting from a seed 
point Xq, sampling boundaries are set by expanding a random initial 
bound of width w until L(xo ± w) < Lq. A new point xi is then ob¬ 
tained within the slice by sampling uniformly within these bounds. 
If xi is not in the slice, w is decreased and xi is drawn again from 
these new bounds. 

In d dimensions. Polychord first ‘whitens’ the parameter 
space, performing a linear skew transformation which turns degen¬ 
erate contours in the original parameter space into into ones with 
dimensions 0(1) in all directions. An initial live point is then cho¬ 
sen randomly with coordinates in this whitened space given by Xq. 
A random initial direction is then chosen fio, and one dimensional 
slice sampling is performed in that direction to generate a new point 
Xi. This process is repeated 0(d) times to generate a new uniformly 
sampled point x^ which is decorrelated from the initial point Xq. 

In the analysis presented in this work we will be dealing with 
models that have up to ~ 100 free parameters, as such Polychord is 
a vital tool for performing robust model comparison in the profile 
domain. 


6 DATASET 

We perform our analysis, and construct simulations that are based 
off the 10cm data set for PSR J1909-3744 that was previously pre¬ 
sented in S15. Fig. 1^ shows the post-fit residuals for the 322 TOAs 
in this data set, observed with a total of 4 systems over the course 
of 10.8 yr. 

These data were recorded at a centre frequency of 3100 MHz, 
with 512-1024 MHz of bandwidth. Over the 10.8 yr of observa¬ 
tion, data were recorded with a number of different spectrometers. 
The improved computing power of the later spectrometers enabled 
the implementation of polyphase digital filters, and an increased 
number of spectral channels and pulse phase bins. These data were 
processed using procedures described in jManchester et ^ ( |2013| (, 
which we will describe in brief below, in order to draw comparison 
with the methods presented in this work. 

Individual observations were first averaged fully in both time 
and frequency to form a single pulse profile for each observation. 
Pulse TOAs were then measured by convolving model templates in 
the Fourier Domain with the total intensity pulse profiles for each 
observational epoch. Model templates were constructed by sum¬ 
ming a series of von Mises functions, each of which is described 
by 3-parameters, a phase, an amplitude, and a width. Initial guesses 
for component locations, amplitudes and widths were added by eye 
and then fitted using a non-linear algorithm. The residuals of the 
fit were inspected to determine if the fit was good or if additional 
components were needed, and the processes iterated upon until the 
residuals appeared to have white-noise characteristics. 

Because of its relatively simple morphology jPai et al.|20T5l ( 
only three components were needed to model the pulse profile, 
however, as in this work, the weak interpulse was not included in 
the model. Templates were formed for each set of systems that had 
markedly different backend architectures, in order to deal with pro¬ 
file distortions that might arise from the time, and frequency sam¬ 
pling for the different systems. Pulse-profile features narrower than 
a pulse bin will be unresolved. Similarly, dispersive smearing can 
broaden a pulse if frequency channels are wide (and, like the sys¬ 
tems used here, coherent dedispersion is not employed) relative to 
the intra-channel dispersion, however, consistent timing solutions 
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were found if one template was used for all the backends, and at 10 
cm this pulsar is not affected by dispersive smearing. 

Between backends discrete phase offsets (referred to as jumps) 
occur because of different propagation times through the digital 
systems. These offsets can also occur within a backend when it, 
or the observatory clock-distribution system experiences significant 
hardware changes. While it is possible in principle to measure rel¬ 
ative offsets between backends independent of the pulsar observa¬ 
tion, (by for example injecting a common signal into multiple back¬ 
ends and measuring the delay), these systems have yet to demon¬ 
strate the accuracy necessary to correct the highest-precision timing 
observations. As a result, we include as free parameters in our anal¬ 
ysis offsets between all of the backends, and, for the Parkes Digital 
Filterbank (PDFB) 4, at MJD 55319 when hardware changes re¬ 
defined the location within the signal processing at which a time- 
stamp was assigned, resulting in the definition of the PDFB4a sys¬ 
tem after this jump date. 


7 SIMULATIONS 

We construct four simulations using a fiducial timing model 
for PSR J1909-3744 consistent with published values (e.g. |van| 
[Haasteren et al.| j2011[ >). In all cases we use a simple Gaussian 
model for the profile, and use the data set described in Section]^ 
to determine the observed flux at each epoch, as well as the base¬ 
line offset. We then include different levels of the radiometer noise, 
systematic pulse jitter, and additional non-stationary profile com¬ 
ponents to the simulations to test the efficacy of our profile domain 
jitter and stochasticity models described in Sections |3.2| and 
For each simulation we process the profile data in the same way as 
described in Section|^to form TOAs against which to compare our 
profile domain analysis. 

In simulations 1 and 2 we include radiometer noise consistent 
with the levels determined from the real data set. In simulation 2 
we then include an additional systematic jitter term to the Wide 
Band Correlator (WBCORR, |Manchester et al.|2013^ profiles with 
an amplitude of 5x 10“^ s, chosen to be consistent with the value ob¬ 
served in the real data set when performing a TOA domain analysis. 
In simulations 3 and 4 we include a factor 10 times less radiometer 
noise than the real data set, and in simulation 4 we include an addi¬ 
tional Gaussian component to the profile, that has a width 0.1% the 
pulse period, and 1 % the amplitude of the profile for any given ob¬ 
servational epoch. The component moves deterministically through 
the main pulse profile over the course of 4000 days. We use this for 
our simulation rather than including stochasticity in the same basis 
as our model in order to check the efficacy of our method when it 
is attempting to correct for epoch to epoch profile variations that 
are not well described by zero mean Gaussian fluctuations in the 
amplitude of the profile components. 

In Fig. we show the residuals for all simulations after sub¬ 
tracting the fiducial timing model. The additional jitter in the WB- 
CORR system is apparent in the early data for simulation 2, as is 
the timing noise induced by the deterministic passage of an addi¬ 
tional Gaussian component through the pulse profile for simulation 
4. 


7.1 Simulations 1 and 2 

In order to compare the profile and TOA domain analysis for sim¬ 
ulations 1 and 2, we define the following models: 


• TOA Model 1: Timing model only. 

• TOA Model 2: Timing model and additional EFAC and 
EQUAD parameters for each observing system. 

• Profile Model 1: Deterministic profile and timing model only. 

• Profile Model 2: Deterministic profile, timing model and ad¬ 
ditional EEAC and jitter parameters for each observing system. 

In both profile domain models we include only the n = 0 
shapelet coefficient in our analysis. We find that the timing model 
parameter estimates and uncertainties for equivalent models in both 
simulations are completely consistent, as we would expect, as the 
assumptions about the stochastic properties in the profile domain 
were propagated correctly into the TOA analysis. 

In simulation 2 we find that the profile domain jitter model 
results in superior constraints on the high frequency variation in 
pulse arrival times compared to the TOA domain EQUAD model. 
In Eig. 1^ (Top) we show a simulated profile for a single observa¬ 
tional epoch from simulation 2, for the WBCORR system, to which 
we applied the additional systematic jitter signal. In the left panel 
we show the simulated profile (blue line), the model profile, evalu¬ 
ated at the arrival time predicted by the maximum likelihood timing 
model from our analysis (green line), and the maximum likelihood 
jitter profile (red line). In the right panel we have subtracted the 
model profile from the data, and the residual is clearly consistent 
with our jitter profile. 

In Fig. 1^ (bottom left) we compare the one and two- 
dimensional posterior parameter estimates for the EFAC and 
EQUAD parameters for the WBCORR system obtained in the TOA 
domain analysis (solid lines in the one-dimensional plots, solid 
contours in the two-dimensional plot), with the EEAC and jitter pa¬ 
rameter estimates from our profile domain analysis (dashed lines 
in the one-dimensional plots, filled dashed contours in the two- 
dimensional plot). The most striking feature of this comparison 
is the difference in constraints placed on the EFAC parameters in 
both domains. In the profile domain we still have the off pulse re¬ 
gion to constrain our estimates of the radiometer noise, informa¬ 
tion that is no longer available when we move to the TOA domain, 
leading to an order of magnitude decrease in precision when the 
EFAC parameter is estimated as part of the analysis. As a by prod¬ 
uct the constraints on the jitter amplitude in the profile domain are 
improved over the TOA domain EQUAD counterpart. This is be¬ 
cause in the TOA domain, the EEAC and EQUAD parameters are 
correlated (indicated by the ellipsoidal contours). In the profile do¬ 
main this is not the case, as seen in the bottom right panel, the jitter 
amplitude, and EFAC parameters are completely decorrelated, as a 
shift in the arrival time of the profile does not look like an increase 
in the radiometer noise, however in the TOA domain, a multiplica¬ 
tive prefactor (EFAC), and an additional term added in quadrature 
(EQUAD), can be completely covariant depending on how similar 
the error bars are across different observations. 


7.2 Simulations 3 and 4 

For simulations 3 and 4 we consider the following models: 

• TOA Model: Timing model, EQUAD parameters for each ob¬ 
serving system, and additional red timing noise. 

• Profile Model 1: Deterministic profile, timing model and ad¬ 
ditional red timing noise. 

• Profile Model 2: Deterministic profile, timing model, high and 
low frequency stochastic profile parameters, and additional red tim¬ 
ing noise. 
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Figure 6. Post-fit residuals for PSR J1909-3744 simulation 1 (top left), 2 (top right), 3 (bottom left), and 4 (bottom right) after subtracting the maximum 
likelihood timing model from a TOA domain analysis in each case. Simulations 1 and 2 include radiometer noise consistent with the levels determined from 
the real data set, and in simulation 2 we include an additional systematic jitter term to the WBCORR profiles with an amplitude of 5 x 10“^ s, chosen to be 
consistent with the value observed in the real data set when performing a TOA domain analysis. Simulations 3 and 4 include a factor 10 times less radiometer 
noise than the real data set, and in simulation 4 we include an additional Gaussian component to the profile, that has a width 0.1% the pulse period, 1% the 
amplitude of the profile for any given observational epoch, and which moves deterministically through the main pulse profile over the course of 4000 days. 


Additionally, for both simulations and for all 3 models we ob¬ 
tain upper limits on the amplitude of a steep red noise process with 
a spectral index of y = 13/3, consistent with that expected for a 
gravitational wave background formed from a population of inspi- 
ralling super-massive black hole binaries, jPhinney|2001) . 

As for simulations 1 and 2, we find that for both simulations 
3 and 4 that the timing model parameters are consistent between 
all 3 models, indicating that, as for the jitter parameters included 
in simulations 1 and 2, including the stochastic profile parameters 
in our analysis does not adversely affect the precision that we can 
achieve when no such stochasticity is present in the data. In addi¬ 
tion, in Fig. (left plot) we find the limits obtained on a steep red 
noise process within the data set are also consistent between the 
three models, with 95% upper limits of A < 1 x 10“*^ in all cases. 

For simulation 4, however, we find that the profile domain 
analysis results in a factor 2 improvement in upper limit when in¬ 
cluding profile stochasticity, compared to either the standard TOA 
domain analysis or the equivalent analysis using Model 1 in the pro¬ 
file domain. We show this explicitly in Fig.[^(right plot), in which 
we plot the upper limits for the same 3 models applied to simulation 


4. While TOA Model 1 (black line), and Profile Model 1 (red line) 
are consistent, with 95% upper limits of A < lx 10“*^, we find a 
factor 2 improvement in the upper limit when incorporating profile 
stochasticity in our analysis (blue line), obtaining A < 5 x 10“*®. 
The only difference between simulations 3 and 4 was the inclusion 
of an additional non-stationary Gaussian component to the profile 
model, that moved through the profile over a period of 4000 days. 

Despite having an amplitude of only 1% that of the profile in 
any observational epoch, the effect on our TOA domain analysis 
was to increase the upper limit by an order of magnitude. While 
our stochastic profile domain model does not fully recover the up¬ 
per limit obtained from simulation 3, we purposefully simulated 
non-stationary behaviour in the pulse profile that was not well de¬ 
scribed by a random Gaussian process. Despite this, we still obtain 
a significantly improved limit compared to the TOA domain analy¬ 
sis. 

The key result both from this comparison, and from the results 
of the jitter analysis in simulations 1 and 2, is that by performing 
the analysis using the profile data, rather than the SATs as is typ¬ 
ical in pulsar timing analysis, we can decorrelate profile variation 
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Figure 7. (Top left) Simulated profile (blue line) for a single observational epoch from simulation 2, for the WBCORR system, to which we applied the 
additional jitter signal. Over plotted are the maximum likelihood model profile, evaluated at the anival time predicted by the maximum likelihood timing 
model from our analysis (green line), and the maximum likelihood jitter profile (red line). (Top right) As in the previous panel, however we have subtracted 
the model profile from the data, leaving a residual that is clearly consistent with our jitter model. (Bottom left) One and two-dimensional posterior parameter 
estimates for the EFAC and EQUAD parameters for the WBCORR system obtained in the TOA domain analysis (solid lines in the one-dimensional plots, solid 
contours in the two-dimensional plot), with the EEAC and jitter parameter estimates from our profile domain analysis (dashed lines in the one-dimensional 
plots, filled dashed contours in the two-dimensional plot). The uncertainty in the EFAC parameter in the profile domain is an order of magnitude smaller 
than in the TOA domain, as we can use the statistical information present in the off pulse region to help constrain our estimates of the radiometer noise. The 
constraints on the jitter amplitude in the profile domain are also improved compared to the TOA domain EQUAD counterpaif. This is because in the TOA 
domain, the EFAC and EQUAD parameters are conelated (indicated by the ellipsoidal contours). In the profile domain this is not the case, as seen in the 
bottom right panel, the jitter amplitude, and EFAC parameters are completely decorrelated in our profile domain analysis. 


from the GW signal and significantly improve both our sensitivity 
to GWs, and our understanding of the stochastic processes present 
in the data. As models for profile variation improve, so too will 
the upper limits on GWs obtainable from a profile domain analy¬ 
sis. Given the covariance between the signal induced in the timing 
residuals from GWs, and from long-term temporal profile variation 
(see Fig.|^, however, only by performing a simultaneous fit can ro¬ 
bust estimates of both signal components be obtained, without one 
absorbing the other. 

We find the amplitude of both the high and low frequency 
stochastic models are consistent with the properties of the addi¬ 
tional profile component added to simulation 4. In Fig. we show 
the one dimensional marginalised posterior parameter estimates for 
both the high frequency (top panel) and low frequency (bottom 
panel) stochastic profile parameters from our analysis of simula¬ 
tion 4. We included separate scaling factors, j8, describing the high 
frequency profile variation for each observing system. Values are 


quoted as the fraction of the total profile amplitude added to the 
uncorrelated variance for each bin. For all systems we find the frac¬ 
tional increase in the uncorrelated noise is < 1 %, however the value 
for different systems varies by a factor ~ 2 relative to one another. 
Given each system samples a different period of time (See Fig.[^ 
this could be the result of the non-stationary nature of the additional 
Gaussian component added to the profile data. 

For the low frequency profile variations we show either the 
mean parameter estimates and Icr confidence intervals (points with 
error bars), or 95% upper limits (arrows). Upper limits are plotted 
where the posterior for that scale is consistent with zero at greater 
than 5% probability. We find that only stochastic variations in the 
n = (2,3,4) profile coefficients are significant. Given the width 
of the profile, A = 0.007 in phase, which corresponds to ~ 7 phase 
bins in the more modern observing systems, beyond n = 4 the scale 
of the fluctuations in the stochastic model will be < 2 bins, and so 
we would expect these to become increasingly covariant with the 
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Figure 8. One-dimensional marginalised posterior parameter estimates for the amplitude of a steep red noise process with a spectral index of y = 13/3 for 
simulation 3 (left plot) and simulation 4 (right plot). Both the TOA domain and profile domain analyses for simulation 3 result in consistent 95% upper limits 
of 1 X 10^*®. In simulation 4 we find that the TOA domain analysis, and a profile domain analysis that includes a red noise signal but no profile stochasticity 
results in consistent 95% upper limits of 1 X 10“*^. Despite the fact that the additional Gaussian component was only 1% of the profile amplitude, and 0.1% 
of the pulse width, it has a significant impact on the upper limit, increasing it hy an order of magnitude. Our profile domain analysis that includes profile 
stochasticity (blue dashed line) significantly improves upon the TOA domain analysis, with a 95% upper limit of 5 X 10“*®. While this is still much greater 
than for the data set that included no profile variation it still demonstrates clearly that by modelling these effects in the profile domain they can be better 
isolated from the GWB signal of interest. 


high frequency stochastic parameters, and thus less significant in 
our analysis. 


8 APPLICATION TO THE PSR JI909-3744 DATA SET 

We now apply our profile domain framework to the PSR 
J1909-3744 data set described in Section [b] As with the simula¬ 
tions we will compare both TOA and profile domain models, defin¬ 
ing: 

• TOA Model 1: Timing model only. 

• TOA Model 2: Timing model, and additional EFAC and 
EQUAD parameters for each observing system. 

• Profile Model 1: Deterministic profile, and timing model only. 

• Profile Model 2: Deterministic profile, timing model, and ad¬ 
ditional EEAC and jitter parameters for each observing system. 

• Profile Model 3: Deterministic profile, timing model, and ad¬ 
ditional EEAC, jitter and high frequency stochastic profile parame¬ 
ters for each observing system. 

• Profile Model 4: Deterministic profile, timing model, EFAC 
parameters for each observing system and high and low frequency 
stochastic profile parameters. 

For each model we find the optimal number of shapelet coeffi¬ 
cients to include by increasing the number of included coefficients 
in steps of 5 until the change in the log evidence is less than 3. Table 
1^ lists the optimal number of coefficients included in each model, 
along with the log evidence values for that model. 

We find that Profile model 1, which includes no additional 
stochastic parameters requires the largest number of components 
(40), while subsequent models that include either profile jitter or 
profile stochasticity require 25% fewer components to describe the 
data. Initially this may seem like a large number compared to the 
3 von Mises functions used in the standard analysis, however in 


Table 2. Evidence values for different models fit to the PSR J1909-3744 
data set. 


Model 

Nc 

logZ 

TOA Model 1 

- 

0 

TOA Model 2 

- 

35.7 

Profile Model 1 

40 

0 

Profile Model 2 

30 

154.2 

Profile Model 3 

30 

1302.7 

Profile Model 4 

30 

1280.5 


this case a separate profile model is fit to the WBCORR, PDFBl, 
PDFB2, and PDFB4 systems. Given a set of n von Mises functions 
require 3n-l parameters (n amplitudes, n widths and n — I relative 
positions), this means that for the total data set 32 parameters were 
used to describe the profile shape, which is consistent with our re¬ 
sults. 

We find timing model parameter estimates and uncertainties 
consistent between TOA model 1 and profile model 1, and between 
TOA model 2 and profile models 2-4. Additionally, as in simula¬ 
tion 2, we find that the inferred amplitude of the profile domain jit¬ 
ter model is consistent with the TOA domain EQUAD model. Fig. 
[T^(top plot) shows a comparison of the one and two-dimensional 
posteriors for the TOA EQUAD model (solid lines in the Id plots, 
contours in the 2d plot) with the profile domain jitter model (dashed 
lines in the Id plots, filled contours in the 2d plot) for the WB¬ 
CORR system. As in the simulations we find the EEAC parame¬ 
ters are better constrained in the profile domain by approximately 
an order of magnitude, and show no observed correlation with the 
jitter parameters leading to better overall constraints being placed 
on their amplitudes. The EFAC parameter that scales the uncor- 
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Figure 9. (Top) One-dimensional posterior parameter estimates for the am¬ 
plitude of the high frequency stochastic element to the profile for Simulation 
4. Separate amplitudes were fit to each observing system. The units of the 
abscissa are fraction of the total profile amplitude added to the uncorrelated 
variance for each bin. The amplitudes are consistent with < 1% of the pro¬ 
file amplitude for all systems however they show a factor ~ 2 spread relative 
to one another. Given each system samples a different period of time (See 
Fig.[^ this could be the result of the non-stationary nature of the additional 
Gaussian component added to the profile data. (Bottom) Mean values and 
Icr confidence intervals (red points with error bars), and 95% upper limits 
(black arrows) for low frequency stochastic variations in the profile. Upper 
limits are plotted where the posterior for that scale is consistent with zero 
at greater than 5% probability. 

related instrumental noise for the WBCORR system, however, is 
not consistent with a value of 1 in profile model 3, implying that 
the residuals are not consistent with our initial estimate of the in¬ 
strumental noise for this backend. This can be seen more clearly 
in the bottom panel of Fig. which shows a comparison of the 
one-dimensional marginalised posterior parameter estimates for the 
WBCORR EFAC parameter from profile models 2 (black line) and 
3 (red line). Assuming that the mean profile suffers only a shift from 
epoch to epoch as in profile model 3 (and in the standard timing 
analysis paradigm) results in an EFAC that is systematically greater 
than 1, implying that the model used is not sufficient to describe the 
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Figure 10. (Top) One and two-dimensional posterior pai'ameter estimates 
for the EFAC and EQUAD parameters for the WBCORR system obtained 
in the TOA domain analysis (solid lines in the one-dimensional plots, solid 
contours in the two-dimensional plot), with the EFAC and jitter parame¬ 
ter estimates from our profile domain analysis (dashed lines in the one¬ 
dimensional plots, filled dashed contours in the two-dimensional plot). As 
in simulation 2, we find that the inferred amplitude of the profile domain 
jitter model was consistent with the TOA domain EQUAD model. We sim¬ 
ilarly see that as before the EFAC parameters are better constrained in 
the profile domain by approximately an order of magnitude, with no ob¬ 
served correlation with the jitter parameters, leading to better overall con¬ 
straints being placed on their amplitudes. (Bottom) Comparison of the one¬ 
dimensional mai'ginalised posterior parameter estimates for the WBCORR 
EFAC pai'ameter from profile models 2 (black line) and 3 (red line). As¬ 
suming that the mean profile suffers only a shift from epoch to epoch as in 
profile model 3 (and in the standard timing analysis paradigm) results in an 
EFAC that is systematically greater than 1, implying that the model used 
is not sufficient to describe the on pulse region. This is confirmed by the 
Bayesian evidence, which is significantly greater (A log .3 > 1000) for a 
model that includes high frequency, systematic profile stochasticity, which 
results in an EFAC that is consistent with 1. 

on pulse region. This is confirmed by the Bayesian evidence, which 
is significantly greater (A log 3 > 1000) for a model that includes 
high frequency, systematic profile stochasticity, which results in an 
EFAC that is consistent with 1. In addition, when including a high 
frequency stochastic profile model simultaneously with the jitter 
parameters as in model 3, we find the significant EQUAD term de¬ 
tected in the TOA domain, and the jitter term seen in profile model 
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Figure 11. Example of two WBCORR profiles that suffer from non- 
Gaussian ‘sawtooth’ noise. Approximately 10% of WBCORR points show 
this effect by eye, however the amplitude varies from profile to profile. In 
the TOA domain this manifests itself as an EQUAD term - high frequency 
variation in the arrival times of the pulses. In the profile domain we find the 
Bayesian evidence is significantly in support (AlogJ3 > 1000) of a model 
where the pulse profile in the WBCORR data has a high frequency stochas¬ 
tic component. When fitting simultaneously for a jitter model, and for a high 
frequency stochastic element, we found the jitter term to be consistent with 
zero, suggesting the EQUAD detected in the TOA domain originates in this 
process. While clearly the optimal solution would be to simply fix such is¬ 
sues at the level of forming the folded data, where that is not possible by 
working in the profile domain one can better model such systematic effects 
and separate them from ‘true’ shifts in the arrival time due to pulse jitter. 


2 is consistent with zero, implying that all of the additional high 
frequency noise observed in the TOA domain is due to this high 
frequency stochastic term. 

The origin of this high frequency noise can clearly be seen in 
Fig. [m which shows two example of profiles from the WBCORR 
system that suffer from non-Gaussian ‘sawtooth’ noise. Approxi¬ 
mately 10% of WBCORR points show this effect by eye, however 
the amplitude varies from profile to profile. While clearly the op¬ 
timal solution would be to simply fix such issues at the level of 
forming the folded data, where that is not possible by working in 
the profile domain one can better model such systematic effects and 
separate them from ‘true’ shifts in the arrival time due to pulse jit¬ 
ter. 

Finally, we find no evidence for low frequency stochasticity in 
the PSR J1909-3744 data set using profile model 4. We show the 
95% upper limits from our joint analysis for the 20 lowest order 
shapelet components in Fig. with units given in terms of the 
fraction of total power in the profile. Upper limits range from 1- 
10% of the total power in the profile. 

8.1 Upper limits on an isotropic stochastic GWB 

As a final comparison, we take TOA model 2, and profile model 

3 and include a red noise model for each. In both cases we ob¬ 
tain upper limits on the amplitude of a steep red noise process with 
a spectral index of y = 13/3. Fig. |13| shows the one-dimensional 
marginalised posterior parameter estimates for the amplitude of the 
red noise process for the TOA domain (black line), and profile do¬ 
main (red line) analysis. Both are consistent with one another, with 
95% upper limit of 1 x 10“'^. Given that we found no evidence for 
low frequency profile stochasticity in the data, and only the older 
WBCORR system, which contributes relatively little weight to the 
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Figure 12. 95% upper limits on the fractional profile stochasticity in the 
PSR J1909-3744 data set described in Section obtained using profile 
model 4. Red arrows represent upper limits from the full analysis, black 
arrows represent upper limits when fixing both the timing model, and mean 
profile model to their maximum likelihood values obtained from profile 
model 3. Upper limits range from 1-10% of the total power in the profile. 
When fixing the timing model and mean profile to their maximum likeli¬ 
hood values we find erroneously stringent upper limits by a factor of up to 
~ 1 . 8 . 



Figure 13. One-dimensional marginalised posterior parameter estimates for 
the amplitude of a red noise process in the PSR J1909-3744 data set de¬ 
scribed in Section[^at a spectral index of 13/3 for (black line) a TOA do¬ 
main analysis using TOA model 2, and (red line) a profile domain analysis 
using profile model 3. We find both ai‘e consistent with a 95% upper limit of 
1 X 10“^^, consistent with values reported in S15. In this data set we found 
no evidence of low frequency profile stochasticity, and only the older WB¬ 
CORR system was significantly affected by high frequency systematics, so 
it is not surprising that upper limits using the optimal models for both the 
TOA and profile domain analyses are consistent with one another. 


full data set, was significantly affected by high frequency system¬ 
atics, it is not surprising that upper limits using the optimal models 
for both the TOA and profile domain analyses are consistent with 
one another. 
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9 CONCLUSIONS 

In this paper we have extended the Bayesian framework ‘Genera¬ 
tive Pulsar Timing Analysis’ to incorporate both pulse jitter (high 
frequency variation in the arrival time of the pulse) and epoch to 
epoch stochasticity in the shape of the pulse profile. This frame¬ 
work allows for a full timing analysis to be performed using the 
folded profile data, rather than the site arrival times as is typical in 
most timing studies. 

We first used simulations to show that while in a standard TOA 
domain analysis of pulsar timing data the effect of profile stochas¬ 
ticity and the signal induced by a gravitational wave background 
are highly covariant, these two signals can be decorrelated when 
performing the analysis in the profile domain, resulting in a sig¬ 
nificantly improved upper limit on a stochastic gravitational wave 
background. We showed that this was also true for high frequency 
variation in the arrival time of the pulse. In the TOA domain this 
is typically modelled using an ‘EQUAD’ parameter, that adds in 
quadrature to the TOA uncertainties. This is highly covariant with a 
scaling of the error bars (using ’EFACs’), which is typically used to 
model uncertainty in the value of the radiometer noise. In the pro¬ 
file domain a model for pulse jitter is completely decorrelated from 
the model for the thermal radiometer noise, resulting in improved 
constraints on the magnitude of the effect in the pulsar timing data. 

Our simulation included temporal variation in the pulse profile 
in the form of an additional Gaussian component with an amplitude 
1% that of the main profile, and a width 0.1% of the pulse period. 
Despite the relatively small amplitude of the additional signal, this 
resulted in an order of magnitude increase in the upper limit ob¬ 
tained in the TOA domain analysis. The techniques described in 
this paper allow for the simultaneous evaluation of both contribu¬ 
tions to the data, resulting in robust upper limits, and eventual de¬ 
tection of a GWB. 

We also applied our methodology to a PSR J1909-3744 10 cm 
data set and find significant evidence for systematic high-frequency 
profile variation resulting from non-Gaussian instrumental noise in 
the oldest observing system in the analysis. We find that the evi¬ 
dence supports this description of the data over a jitter model where 
the variation is modelled simply as a shift of the arrival time of the 
mean profile. We find no evidence for either detectable pulse jitter, 
or low-frequency profile variation, and set limits on the latter of be¬ 
tween 1 and 10% for the different profile model components. Using 
our profile domain framework we obtain upper limits on a red noise 
process with a spectral index of y = 13/3 of 1 x 10“'^, consistent 
with previously published limits. 

With current upper limits for a stochastic GWB now begin¬ 
ning to rule out existing models for black hole binary populations, 
profile variability at the <1% level can become an important limit¬ 
ing factor in the sensitivity achievable in pulsar timing experiments. 
By moving away from the standard timing paradigm, where a tim¬ 
ing model is fit to a set of site arrival times, and instead working 
directly with the profile data, these effects can be modelled appro¬ 
priately, and the covariance with a possible background, or other 
signals of interest, can be significantly decreased. As upcoming ra¬ 
dio telescopes such as the Square Kilometre Array come on-line, 
improvements in sensitivity will increase the significance of these 
effects on the pulse arrival times as measured through traditional 
methods, making a profile domain analysis the natural choice in or¬ 
der to improve the prospects for the detection of gravitational waves 
with pulsars. 

Finally, the methodology we have described can be easily ex¬ 
tended to account for shape variations associated with geodetic pre¬ 


cession in relativistic systems, such as the Hulse-Taylor Binary and 
the double pulsar system, in which pulsars show dramatic secular 
shape variability. While these shape variations are not stochastic, 
by accounting for shape variability simultaneously to timing, a ro¬ 
bust characterization of the relativistic orbit can be obtained, and 
the precision of tests of general relativity will be increased. 


10 ACKNOWLEDGEMENTS 

The Parkes radio telescope is part of the Australia Telescope Na¬ 
tional Facility which is funded by the Commonwealth of Australia 
for operation as a National Facility managed by Commonwealth 
Science and Industrial Research Organization (CSIRO). We thank 
all of the observers, engineers, and Parkes observatory staff mem¬ 
bers who have assisted with the observations reported in this paper. 

LL was supported by a Junior Research Fellowship at Trin¬ 
ity Hall College, Cambridge University. RMS acknowledges travel 
support from the CSIRO through a John Philip award for excellence 
in early career research. 


REFERENCES 

Coles W., Hobbs G., Champion D. J., Manchester R. N., Verbiest 
J. P. W., 2011, Mon. Not. R. Astron. Soc., 418, 561 
Dai S., et ah, 2015, Mon. Not. R. Astron. Soc., 449, 3223 
Demorest P. B., et ah, 2013, The Astrophys. J., 762, 94 
Detweiler S., 1979, The Astrophys. J., 234, 1100 
Handley W. J., Hobson M. P, Lasenby A. N., 2015, ArXiv e-prints 
Hankins T. H., Cordes J. M., 1981, The Astrophys. J., 249, 241 
Hobbs G., Lyne A. G., Kramer M., 2010, Mon. Not. R. Astron. 
Soc., 402, 1027 

Hotan A. W., Bailes M., Ord S. M., 2004, Mon. Not. R. Astron. 
Soc., 355, 941 

Hotan A. W., Bailes M., Ord S. M., 2005, Mon. Not. R. Astron. 
Soc., 362, 1267 

Hotan A. W., Bailes M., Ord S. M., 2006, Mon. Not. R. Astron. 
Soc., 369, 1502 

Jenet F. A., Anderson S. B., 1998, PASP, 110, 1467 
Kass R. E., Raftery A. E., 1995, Journal of the American Statisti¬ 
cal Association, 90, 791 

Kelly B. C., McKay T. A., 2004, The Astrophys. J., 127, 625 
Lentati L., Alexander P, Hobson M. P, 2015, Mon. Not. R. As¬ 
tron. Soc., 447, 2159 

Lentati L., Alexander P., Hobson M. P, Feroz R, van Haasteren 
R., Lee K. J., Shannon R. M., 2014, Mon. Not. R. Astron. Soc., 
437, 3004 

Lentati L., Carilli C., Alexander P, Maiolino R., Wang R., Cox P, 
Downes D., McMahon R., Menten K. M., Neri R., Riechers D., 
Wagg J., Walter R, Wolfe A., 2013, Mon. Not. R. Astron. Soc., 
430, 2454 

Lentati L., Hobson M. P, Alexander P, 2014, Mon. Not. R. As¬ 
tron. Soc., 444, 3863 

Liu K., Desvignes G., Cognard I., Stappers B. W., Verbiest 
J. P. W., Lee K. J., Champion D. J., Kramer M., Rreire P. C. C., 
Karuppusamy R., 2014, Mon. Not. R. Astron. Soc., 443, 3752 
Liu K., Karuppusamy R., Lee K. J., Stappers B. W., Kramer M., 
Smits R., Purver M. B., Janssen G. H., Perrodin D., 2015, Mon. 
Not. R. Astron. Soc., 449, 1158 

Lyne A., Hobbs G., Kramer M., Stairs I., Stappers B., 2010, Sci¬ 
ence, 329, 408 


© 0000 RAS, MNRAS 000, 000-000 



16 L. Lentati et al. 


Manchester R. N., et al., 2013, PASA, 30, 17 
Neal R. M., 2000, ArXiv Physics e-prints 
Pennucci T. T., Demorest P. B., Ransom S. M., 2014, The Astro- 
phys. J., 790, 93 

Phinney E. S., 2001, ArXiv Astrophysics e-prints 
Refregier A., 2003, Mon. Not. R. Astron. Soc., 338, 35 
Refregier A., Bacon D., 2003, Mon. Not. R. Astron. Soc., 338, 48 
RickettB J., 1969, 221, 158 

Romani R. W., Johnston S., 2001, The Astrophys. J., 557, L93 
Sallmen S., Backer D. C., Hankins T. H., Moffett D., Lundgren S., 
1999, The Astrophys. J., 517, 460 
Sazhin M. V., 1978, Soviet Astronomy, 22, 36 
Shannon R. M., et al., 2014, Mon. Not. R. Astron. Soc., 443, 1463 
Shannon R. M., Ravi V., Lentati L. T., Lasky P. D., Hobbs G., Kerr 
M., Manchester R. N., Coles W. A., Levin Y., Bailes M., Bhat N. 
D. R., et. al. 2015, Science, 349, 1522 
Shao L., Caballero R. N., Kramer M., Wex N., Champion D. J., 
Jessner A., 2013, Classical and Quantum Gravity, 30, 165019 
Skilling J., 2004, in Fischer R., Preuss R., Toussaint U. V., eds, 
American Institute of Physics Conference Series Vol. 735 of 
American Institute of Physics Conference Series, Nested Sam¬ 
pling. pp 395-405 

Taylor J. H., 1992, Royal Society of London Philosophical Trans¬ 
actions Series A, 341, 117 

van Haasteren R., et al., 2011, Mon. Not. R. Astron. Soc., 414, 
3117 

van Haasteren R., Levin Y, 2013, Mon. Not. R. Astron. Soc., 428, 
1147 

Wolszczan A., Cordes J., Stinebring D., 1984, in Reynolds S. P, 
Stinebring D. R., eds. Birth and Evolution of Neutron Stars: Is¬ 
sues Raised by Millisecond Pulsars a Single Pulse Study of the 
Millisecond Pulsar 1937-1-214 (Poster), p. 63 


© 0000 RAS, MNRAS 000, 000-000 



